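Neural networks are less accurate when classifying images perturbed with noise. Convolutional neural networks (CNNs) are known for their unparalleled accuracy in classifying benign images. However, our study shows that they are extremely vulnerable to noise, whereas feed-forward neural networks (FNNs) are less affected by noise perturbation and maintain their accuracy almost undisturbed. FNNs were observed to be better at classifying noise-intensive, single-channel images that appear as nothing but noise to human vision. In our study, we used the handwritten-digit dataset MNIST with the following architectures: FNNs with 1 and 2 hidden layers, and CNNs with 3, 4, 6, and 8 convolutions, and analyzed their accuracies. FNNs stand out: irrespective of the noise intensity, their classification accuracy exceeds 85%. In our analysis of CNNs on this data, the deceleration in classification accuracy of the CNN with 8 convolutions was half that of the remaining CNNs. Correlation analysis and mathematical modeling of the accuracy trends serve as the roadmap to these conclusions.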
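The kind of perturbation studied above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not state the noise model, so zero-mean Gaussian noise with clipping to [0, 1] is an assumption, and `add_gaussian_noise` and `sigma` are hypothetical names.

```python
import random


def add_gaussian_noise(image, sigma, seed=None):
    """Return a noisy copy of a single-channel image.

    image: list of rows, each a list of pixel intensities in [0.0, 1.0].
    sigma: standard deviation of the added noise (assumed noise model).
    """
    rng = random.Random(seed)
    noisy = []
    for row in image:
        # Add zero-mean Gaussian noise, then clip back into the valid range.
        noisy.append([min(1.0, max(0.0, px + rng.gauss(0.0, sigma))) for px in row])
    return noisy


# A blank 28x28 image, the size of an MNIST digit.
clean = [[0.0] * 28 for _ in range(28)]
noisy = add_gaussian_noise(clean, sigma=0.5, seed=0)
```

Sweeping `sigma` over a range of values and re-evaluating each trained model would give an accuracy-versus-noise curve of the kind the abstract analyzes.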
Translated by Google Translate
Deep learning techniques with neural networks have been used effectively in computational fluid dynamics (CFD) to obtain solutions to nonlinear differential equations. This paper presents a physics-informed neural network (PINN) approach to solving for the Blasius function. The method removes the need to convert the nonlinear differential equation into an initial value problem, and it also addresses the convergence issue that arises in the conventional series solution. The results are on par with those of numerical and conventional methods. The solution is extended to the negative axis to show that PINNs capture the singularity of the function at $\eta=-5.69$.
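For orientation, the boundary value problem and a typical PINN objective can be written as below. This is a sketch under assumptions: the abstract does not state which normalization of the Blasius equation is used, nor the loss; the form $2f''' + f f'' = 0$, the truncation point $\eta_{\max}$, the collocation points $\eta_i$, and the unweighted sum of loss terms are illustrative choices.

```latex
% Blasius boundary value problem (one common normalization, assumed here)
\begin{aligned}
2 f'''(\eta) + f(\eta)\, f''(\eta) &= 0, \\
f(0) = 0, \qquad f'(0) = 0, \qquad f'(\eta) &\to 1 \ \text{as}\ \eta \to \infty .
\end{aligned}

% A typical PINN loss for a network f_\theta, with the far-field condition
% imposed at a finite \eta_{\max} and the residual sampled at N points \eta_i
\mathcal{L}(\theta) =
\frac{1}{N} \sum_{i=1}^{N}
\bigl( 2 f_\theta'''(\eta_i) + f_\theta(\eta_i)\, f_\theta''(\eta_i) \bigr)^2
+ f_\theta(0)^2 + f_\theta'(0)^2
+ \bigl( f_\theta'(\eta_{\max}) - 1 \bigr)^2 .
```

Because the boundary conditions enter the loss directly, no shooting step is needed to turn the problem into an initial value problem.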
Graph Neural Networks (GNNs) have been widely applied to tasks in areas such as bioinformatics, drug design, and social networks. However, recent studies have shown that GNNs are vulnerable to adversarial attacks that aim to mislead node or subgraph classification by adding subtle perturbations. Detecting these attacks is challenging because of the small magnitude of the perturbations and the discrete nature of graph data. In this paper, we propose EDoG, a general adversarial edge detection pipeline based on graph generation that requires no knowledge of the attack strategies. Specifically, we propose a novel graph generation approach combined with link prediction to detect suspicious adversarial edges. To train the graph generative model effectively, we sample several sub-graphs from the given graph data. We show that, since the number of adversarial edges is usually low in practice, the sampled sub-graphs contain adversarial edges only with low probability, by the union bound. In addition, to handle strong attacks that perturb a large number of edges, we propose a set of novel features for outlier detection as a preprocessing step for our detection. Extensive experimental results on three real-world graph datasets, including a private transaction rule dataset from a major company, and on two types of synthetic graphs with controlled properties show that EDoG achieves above 0.8 AUC against four state-of-the-art unseen attack strategies without any knowledge of the attack type, and around 0.85 with knowledge of the attack type. EDoG significantly outperforms traditional malicious edge detection baselines. We also show that it is difficult for an adaptive attack with full knowledge of our detection pipeline to bypass it.
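The union-bound argument above can be illustrated with a short calculation. This is a sketch under an assumption that the abstract does not state: each sampled edge is drawn uniformly from the graph, so it is adversarial with probability `num_adversarial / num_edges`. The function name is hypothetical.

```python
def contamination_upper_bound(num_adversarial, num_edges, sample_size):
    """Union bound on P(a sampled sub-graph contains >= 1 adversarial edge).

    Each of the sample_size sampled edges is adversarial with probability
    num_adversarial / num_edges (uniform sampling assumed), and the
    probability of a union of events is at most the sum of their
    probabilities. The result is capped at 1.
    """
    p_single = num_adversarial / num_edges
    return min(1.0, sample_size * p_single)


# 10 adversarial edges among 10,000; sub-graphs of 50 sampled edges.
bound = contamination_upper_bound(10, 10_000, 50)
```

With these illustrative numbers the bound is about 0.05, so most sampled sub-graphs would be clean training data for the generative model.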
Node classification for graph-structured data aims to classify nodes whose labels are unknown. While studies on static graphs are prevalent, few have focused on node classification in dynamic graphs. Node classification on dynamic graphs is challenging for two reasons. First, the model needs to capture both structural and temporal information, particularly on dynamic graphs that have a long history and require large receptive fields. Second, model scalability becomes a significant concern as the size of the dynamic graph increases. To address these problems, we propose the Time Augmented Dynamic Graph Neural Network (TADGNN) framework. TADGNN consists of two modules: 1) a time augmentation module that captures the temporal evolution of nodes across time structurally, creating a time-augmented spatio-temporal graph, and 2) an information propagation module that learns the dynamic representations for each node across time using the constructed time-augmented graph. We perform node classification experiments on four dynamic graph benchmarks. Experimental results demonstrate that the TADGNN framework outperforms several static and dynamic state-of-the-art (SOTA) GNN models while demonstrating superior scalability. We also conduct theoretical and empirical analyses to validate the efficiency of the proposed method. Our code is available at https://sites.google.com/view/tadgnn.
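One plausible way to build a time-augmented spatio-temporal graph is sketched below. The abstract does not describe TADGNN's actual construction, so this is an assumption: each node is copied once per snapshot, structural edges are kept within a snapshot, and a directed temporal edge links a node's copy at step t-1 to its copy at step t. The function name is hypothetical.

```python
def build_time_augmented_graph(snapshots):
    """Build a graph over (node, time) pairs from a list of snapshots.

    snapshots: list indexed by time step; each entry is a list of
    undirected edges (u, v) present at that step.
    Returns a set of directed edges ((node, t), (node, t')).
    """
    edges = set()
    prev_nodes = set()
    for t, snapshot in enumerate(snapshots):
        nodes = set()
        for u, v in snapshot:
            # Structural edges inside snapshot t, stored in both directions.
            edges.add(((u, t), (v, t)))
            edges.add(((v, t), (u, t)))
            nodes.update((u, v))
        # Temporal edges: link a node to its own copy in the next snapshot.
        for n in nodes & prev_nodes:
            edges.add(((n, t - 1), (n, t)))
        prev_nodes = nodes
    return edges


graph = build_time_augmented_graph([[("a", "b")], [("b", "c")]])
```

A message-passing model run on this single graph can then propagate information across both neighbors and time steps, which is the role the abstract assigns to the information propagation module.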
The primary obstacle to developing technologies for low-resource languages is the lack of representative, usable data. In this paper, we report the deployment of technology-driven data collection methods for creating a corpus of more than 60,000 translations from Hindi to Gondi, a low-resource vulnerable language spoken by around 2.3 million tribal people in south and central India. During this process, we help expand information access in Gondi along two dimensions: (a) the creation of linguistic resources that can be used by the community, such as a dictionary, children's stories, Gondi translations from multiple sources, and an Interactive Voice Response (IVR) based mass awareness platform; and (b) enabling its use in the digital domain by developing a Hindi-Gondi machine translation model, which is compressed by nearly 4 times to enable its deployment on low-resource edge devices and in areas with little to no internet connectivity. We also present preliminary evaluations of using the developed machine translation model to assist volunteers who are involved in collecting more data for the target language. Through these interventions, we not only created a refined and evaluated corpus of 26,240 Hindi-Gondi translations that was used for building the translation model, but also engaged nearly 850 community members who can help take Gondi onto the internet.
This work introduces the novel task of Source-free Multi-target Domain Adaptation and proposes an adaptation framework comprising Consistency with Nuclear-Norm Maximization and MixUp knowledge distillation (CoNMix) as a solution to this problem. The main aim of this work is to solve Single and Multi target Domain Adaptation (SMTDA) in the source-free paradigm, which imposes the constraint that labeled source data is not available during target adaptation because of privacy-related restrictions on data sharing. The source-free approach leverages target pseudo labels, which can be noisy, to improve target adaptation. We introduce consistency between label-preserving augmentations and use pseudo-label refinement methods to reduce noisy pseudo labels. Further, we propose a novel MixUp Knowledge Distillation (MKD) for better generalization on multiple target domains using various source-free STDA models. We also show that the Vision Transformer (VT) backbone gives better feature representation with improved domain transferability and class discriminability. Our proposed framework achieves state-of-the-art (SOTA) results in various source-free STDA and MTDA settings on popular domain adaptation datasets such as Office-Home, Office-Caltech, and DomainNet. Project Page: https://sites.google.com/view/conmix-vcl
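Goal-oriented generative script learning aims to generate subsequent steps based on a goal, which is an essential task for helping robots perform stereotypical activities of daily life. We show that performance on this task can be improved if historical states are captured not only by the linguistic instructions given to people but also augmented with the additional information provided by accompanying images. We therefore propose a new task, Multimedia Generative Script Learning, to generate subsequent steps by tracking historical states in both the text and vision modalities, and we present the first benchmark, containing 2,338 tasks and 31,496 steps. We aim to generate scripts that are visual-state trackable, inductive for unseen tasks, and diverse in their individual steps. We propose to encode visual state changes through a multimedia selective encoder, to transfer knowledge from previously observed tasks using a retrieval-augmented decoder, and to present different information at each step by optimizing a diversity-oriented contrastive learning objective. We define metrics to evaluate both generation quality and inductive quality. Experimental results show that our approach significantly outperforms strong baselines.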
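Video retrieval has made tremendous progress with the development of vision models. However, further improving these models requires additional labeled data, which is a huge manual effort. In this paper, we propose a framework, MKTVR, that leverages knowledge transfer from a multilingual model to boost the performance of video retrieval. We first use state-of-the-art machine translation models to construct pseudo ground-truth multilingual video-text pairs. We then use this data to learn a video-text representation in which English and non-English text queries are represented in a common embedding space based on pretrained multilingual models. We evaluate our proposed approach on four English video retrieval datasets: MSRVTT, MSVD, DiDeMo, and Charades. Experimental results show that our approach achieves state-of-the-art results on all datasets, outperforming previous models. Finally, we also evaluate our model on a multilingual video retrieval dataset covering six languages and show that our model outperforms previous multilingual video retrieval models in a zero-shot setting.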
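Retrieval in a common embedding space, as described above, reduces to ranking videos by similarity to the query vector. The sketch below assumes cosine similarity and uses toy two-dimensional vectors; the abstract does not specify MKTVR's similarity function or encoders, and all names here are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_videos(query_vec, video_vecs):
    """Return video ids sorted from most to least similar to the query.

    video_vecs: dict mapping a video id to its embedding.
    """
    return sorted(
        video_vecs,
        key=lambda vid: cosine(query_vec, video_vecs[vid]),
        reverse=True,
    )


# Toy embeddings standing in for encoder outputs.
videos = {"v1": [1.0, 0.0], "v2": [0.0, 1.0], "v3": [1.0, 1.0]}
ranking = rank_videos([0.9, 0.1], videos)
```

Because queries in any language are mapped into the same space, the ranking step itself is language-independent; only the text encoder has to be multilingual.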
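One of the most pressing challenges in the steel manufacturing industry is the identification of surface defects. Early identification of casting defects can help improve performance, including by streamlining production processes. Although deep learning models have helped bridge this gap and automate most of these processes, there is a need for lightweight models that can be deployed easily with faster inference times. This study proposes a lightweight architecture that is efficient in terms of accuracy and inference time compared with sophisticated pre-trained CNN architectures such as MobileNet, Inception, and ResNet, as well as vision transformers. We experimented with methods that minimize computational requirements, such as depth-wise separable convolution and global average pooling (GAP) layers, along with techniques that improve architectural efficiency and with augmentations. Our results indicate that a custom model of 590K parameters with depth-wise separable convolutions outperformed pre-trained architectures such as ResNet and vision transformers in accuracy (81.87%) and comfortably surpassed architectures such as ResNet, Inception, and vision transformers in inference time (12 ms). BlurPool outperformed the other techniques, with an accuracy of 83.98%. Augmentations had a paradoxical effect on model performance. No direct correlation was found between depth-wise and 3x3 convolutions and inference time; they did, however, play a direct role in improving model efficiency by enabling the network to go deeper and by reducing the number of trainable parameters. Our work sheds light on the fact that custom networks with efficient architectures and faster inference times can be built without relying on pre-trained architectures.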
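The parameter savings of depth-wise separable convolution follow from simple arithmetic, sketched below. The layer sizes are illustrative and are not taken from the paper; bias terms are ignored, and the function names are hypothetical.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard convolution: one kernel x kernel x c_in filter per output channel."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out):
    """Weights in a depth-wise separable convolution.

    Depth-wise step: one kernel x kernel filter per input channel.
    Point-wise step: a 1x1 convolution mixing c_in channels into c_out.
    """
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Illustrative layer: 3x3 kernel, 64 input channels, 128 output channels.
standard = standard_conv_params(3, 64, 128)    # 73728
separable = separable_conv_params(3, 64, 128)  # 8768
```

For this layer the separable form needs roughly 8.4 times fewer weights, which is how a network can go deeper while keeping its trainable parameter count low.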
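Tree Ensemble (TE) models (e.g., boosted trees and random forests) often provide higher prediction performance than single decision trees. However, TE models generally lack transparency and interpretability, because humans have difficulty understanding their decision logic. This paper presents a novel approach for converting a TE trained for a binary classification task into a rule list (RL) that is equivalent to the TE and comprehensible to a human. This RL captures all the necessary conditions for the TE's decisions. Experiments on benchmark datasets show that, compared with state-of-the-art methods, (i) predictions from the RL generated by TE2Rules have high fidelity with respect to the original TE, (ii) the RL from TE2Rules has high interpretability, as measured by the number and length of the decision rules, (iii) the run time of the TE2Rules algorithm can be reduced significantly at the cost of slightly lower fidelity, and (iv) the RL is a fast alternative to state-of-the-art rule-based instance-level outcome explanation techniques.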
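How a rule list predicts, and how its fidelity to the original ensemble can be measured, is sketched below. This does not use the TE2Rules library or its API: the rule representation, feature names, and function names are hypothetical, and fidelity is taken to be the fraction of inputs on which the rule list agrees with the ensemble.

```python
def rule_list_predict(rules, x, default=0):
    """Predict 1 if any rule fires on x, else the default class.

    rules: list of rules; each rule is a list of (feature, op, threshold)
    conditions that must all hold, with op being "<=" or ">".
    x: dict mapping feature names to numeric values.
    """
    for rule in rules:
        if all(
            (x[feat] <= thr) if op == "<=" else (x[feat] > thr)
            for feat, op, thr in rule
        ):
            return 1
    return default


def fidelity(ensemble_preds, rule_preds):
    """Fraction of instances on which the rule list matches the ensemble."""
    matches = sum(e == r for e, r in zip(ensemble_preds, rule_preds))
    return matches / len(ensemble_preds)


# One hypothetical rule: age > 30 AND income <= 50.
rules = [[("age", ">", 30), ("income", "<=", 50)]]
prediction = rule_list_predict(rules, {"age": 40, "income": 20})  # rule fires
```

The interpretability measures named in the abstract map directly onto this representation: the number of rules is `len(rules)` and a rule's length is its number of conditions.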